Node-based Induction of Tree-Substitution Grammars

Author

  • Rose Sloan
Abstract

Traditionally, syntactic parsing is done using probabilistic context-free grammars (PCFGs) or variants thereof, as there are standard efficient methods for parsing with PCFGs and for extracting them from a corpus. However, PCFGs do not accurately represent many dependencies in natural language. For example, many determiners can only occur with certain types of nouns. Determiners like "a" and "another" can only occur with singular count nouns, while "those" can only precede plural nouns, and determiners like "more" can precede either plural nouns or mass nouns but not singular count nouns. To represent these dependencies with a PCFG, we must have many separate categories for both determiners and nouns, as a simple rule like NP → DT N (where DT stands for determiner and N stands for noun) will overgenerate noun phrases like "a water" or "those cat".

One formalism that makes representing long-range dependencies much simpler is the probabilistic tree-substitution grammar (PTSG). While every CFG rule can be seen as one level of a syntactic parse tree, and thus a subtree of height 2, TSG rules can be any subtree of a syntactic tree, which allows them to concisely represent dependencies that involve more than one level of syntactic structure. For example, a TSG for generating noun phrases can represent what types of nouns common determiners can precede using rules like the following: NP
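The overgeneration problem described in the abstract can be illustrated with a small Python sketch. It is not the paper's induction method. The toy lexicon, the noun-class labels (`SG`, `PL`, `MASS`), and the fragment list are all assumptions made for illustration. The sketch compares the noun phrases licensed by the single CFG rule NP → DT N with those licensed by TSG-style fragments in which the determiner word is already attached and the noun slot is restricted to one class.

```python
from itertools import product

# Toy lexicon (illustrative assumption, not taken from the paper).
DETERMINERS = ["a", "another", "those", "more"]
NOUNS = {"cat": "SG", "cats": "PL", "water": "MASS"}


def cfg_noun_phrases():
    """Expand the single CFG rule NP -> DT N with every determiner and noun."""
    return {f"{dt} {noun}" for dt, noun in product(DETERMINERS, NOUNS)}


# Hypothetical TSG fragments. Each pair stands for a subtree of the form
# (NP (DT <word>) N), where the determiner word is part of the fragment and
# the noun is an open substitution site restricted to one noun class.
TSG_FRAGMENTS = [
    ("a", "SG"),
    ("another", "SG"),
    ("those", "PL"),
    ("more", "PL"),
    ("more", "MASS"),
]


def tsg_noun_phrases():
    """Fill each fragment's noun slot with every noun of the permitted class."""
    return {
        f"{dt} {noun}"
        for dt, noun_class in TSG_FRAGMENTS
        for noun, cls in NOUNS.items()
        if cls == noun_class
    }


if __name__ == "__main__":
    cfg = cfg_noun_phrases()
    tsg = tsg_noun_phrases()
    print("CFG only (overgenerated):", sorted(cfg - tsg))
    print("TSG:", sorted(tsg))
```

Because each fragment spans both the NP level and the determiner's lexical level, the determiner-noun restriction lives in the fragment itself, and no separate determiner categories are needed in this toy setting.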


Similar resources

Native Language Detection with Tree Substitution Grammars

We investigate the potential of Tree Substitution Grammars as a source of features for native language detection, the task of inferring an author’s native language from text in a different language. We compare two state-of-the-art methods for Tree Substitution Grammar induction and show that features from both methods outperform previous state-of-the-art results at native language detection. Fu...


A Python-based Interface for Wide Coverage Lexicalized Tree-adjoining Grammars

This paper describes the design and implementation of a Python-based interface for wide coverage Lexicalized Tree-adjoining Grammars. The grammars are part of the XTAGGrammar project at the University of Pennsylvania, which were hand-written and semi-automatically curated to parse real-world corpora. We provide an interface to the wide coverage English and Korean XTAG grammars. Each XTAG gramma...


Toward Tree Substitution Grammars with Latent Annotations

We provide a model that extends the split-merge framework of Petrov et al. (2006) to jointly learn latent annotations and Tree Substitution Grammars (TSGs). We then conduct a variety of experiments with this model, first inducing grammars on a portion of the Penn Treebank and the Korean Treebank 2.0, and next experimenting with grammar refinement from a single nonterminal and from the Universal ...


Comparison of XTAG and LEXSYS grammars

We use the Lexicalised D-Tree Grammar (LDTG) formalism (Rambow et al. 95), which is based on the Lexicalized Tree Adjoining Grammar (LTAG) formalism. In LDTG, there are two types of edges between nodes: d-edges, represented with a broken line, and p-edges, represented by a solid line. Trees are combined by two substitution-like operations, both of which involve combining two descriptions, by eq...


Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing

Inducing a grammar directly from text is one of the oldest and most challenging tasks in Computational Linguistics. Significant progress has been made for inducing dependency grammars; however, the models employed are overly simplistic, particularly in comparison to supervised parsing models. In this paper we present an approach to dependency grammar induction using tree substitution grammar whi...




Publication date: 2016